


MSFT NO. 304843.01 
Attorney Docket NO. MCS-0404)3 



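SOFTWARE-IMPLEMENTED TRANSFORM AND LIGHTING MODULE AND PIPELINE FOR GRAPHICS RENDERING ON EMBEDDED PLATFORMS USING A FIXED-POINT NORMALIZED HOMOGENOUS COORDINATE SYSTEM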



by 

Lifeng Wang 

Ke Deng 
Baining Guo 
and 

Joshua William Buckman 



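TECHNICAL FIELD

The present invention relates in general to graphics rendering systems and more particularly to a software-implemented transform and lighting module and pipeline that uses a fixed-point normalized homogenous coordinates system for vector operations.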

BACKGROUND OF THE INVENTION

Three-dimensional (3D) enabled embedded platforms have become increasingly important in many environments. In products such as set-top boxes, Web pads and mobile computing devices, users continue to expect capabilities that converge with their desktop equivalents. Thus, 3D rendering is vital in today's embedded systems.

O...emorepopu,ar30rend.ngs.hd^-^^^^^ 

Di,.ct3D by Microsoft® . , 30 obie* Direct3D provide 

■,„.e,,ace(API).or-t.anipu.a«ngandd«p,a.ng^^^^^^^^^ 

programme.anddeve.pers«^waV.^^^^^^ 

utilize whatever graphics „„,ering in desktop 

Oireot30doesanexce,,entiob,nsup^^^^-^ 

r;-zrc:=:r:::::p-ng.ni.opos,. 

5 desidop systems use <ioaW-po.nt "P^'^*"' '° ^ ^ p^erful 



20 platforms 






graphics hardware and processors based on executKin 

processing pip^inesbased on .oat,ng-p2^^^^^ 

by a GPU (such as is avaiiabie ^^^o^^^^ ,<,«„g.point sottware 

-'""^'"i:rs::cr;te,argeamo^ 
'r:rrr:r:— ^^^^^^ 
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• lompnted T&L pipeline that is fast, efficient. 

requires little memory and has a smau 
embedded platforms. 

as n»* compu^ng '^'^^^J^'^^^^^,^ ,pecif,oaUy to ioc^ase efficiency 
,„.udecf number o,.ea,u.^^a^^^^^^^^^ 
I and performance on embedded dev,oes. P 

are designed for desW systems hav,ng sing unit (CPU). In 

---^tu^—P ^^^^^^^^^ 

addition, the CPUS Of these desKtop y embedded devices 

s— -"--::^r:roPr:t.nasing,estream,ine 
5 often lack coprocessors and GPUs, and na 

andanot designed for parallel operattons. 

..etrans,om,and lighting modu,eand.P*^n*d.-.^^^^^ 
.reamlinebrancbedarc^necturetbatallows^^^^^ 

20 embedded device and saves --^-^^2^ ,^ 3,,, auplicat^n In 
,,„3,,averte.c..jats^^^^^^^ 

pressing of the vert^es. For e«,r^P . ^ 
been lit, then those vertices are sent to the vert 
p.cessingand*erebyavolddup«P~^^^ 

as i-'-"-r'^;::ra"rnanembeddeddevlce. 
additional hardware that is at a premi 

transfon. and ilgh«ng mod.ie and P^P^^ ^^^^^^^^^ 
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, — ..P— use, 

wangle Mo^ed. "-•*-'^^7^":,:;n;ewfrusU.n,*p.ane. «so. 
determining whether a vertex is outside of one

then the vertex is culled. 

, .etrane^andilgh^ng™.. — ^ 

oon,pu«ng colorfton, the -^ices and a rans, ^^^^^ ^ 

..nsformation nroduie, for "-f-^ ,,at Involves 

„,.,ngn,odu,ealsoino.d.— 

interpolation of the color and the texfc. 3,^ system 

,5 diPPin^modulelsdeslgnedfornon.^^ ^^,^,,3t^ 
(NHCS)«lntoper^o^j2^™--^^^^^,^^^^^^^^ 

Ccationons^eofthe transform and light,ng module. 



30 
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oon.a*gva*es. The .n<.e.ng data ,s,n.^^^^^^^ 

„en is — ed ^ n^del space ,nto * J- ^ ^^^^^ 

..examined p.o,to,ig.in«.de.en™r^^v*^^^^^^^^^^ 

vertexoache as needed. ^'^^ ^^"""^ . This arehitectu« avoids 

.^esasinoiest— --^^^ 

acriarr:;:..^^^^^^^^^^^^ 

coordinates. 

detailed description of the invention, taken in conjunction with the accompanying drawings, which illustrate, by way of example, the principles of the invention.

Referring now to the drawings in which like reference numbers represent corresponding parts throughout:

on the driver module . ^ ^^3^^^ environment 
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and lighting module shown in FiG. 3. ^ 

shown in FiG. 4. ,„„rfraiina the operation of the cuiiing 

FIG. 6 is a detailed flow diagram illustrating the operation of the culling module shown in FIG. 4.

FIG. 7 is a working example of the transform and lighting module and pipeline that is shown for illustrative purposes only.

FIG. 8 illustrates an exemplary implementation of ...

FIGS. 9A and 9B illustrate an exemplary implementation of normalized vectors in a D3DM Phong Model.

— =r:=:::escope..ep.sentin.^^^^ 

GeneiaLSyaDflSa ,,„rt^rts (such as DirectSD) designed for 
Typical graphic rendenng lighting 
.esKtopsvstemsusefloatlng-poin^^--;^^^^ 



platforms. 

30 
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The transform and :igMin9 module PJ^^ ^ ,„es no. .equi. 

,„creasee«o:enovandpeKorma„-^-^^ 

powerful pmcessors or graphics hardw ^ ^^^^ 

:ndp:p.«-«.esa.n.e.rea^^^^^^^ 

efficient processing on a CPU f a^ ^ ve-trces 

^. Tt-isarchUectureistacrlrtatedb^^-o' The transton. and 

as neededto avoid duplication >n fisting T&Ltechnip^^^ as.oi.ows. 
, ,ig«a.L)moduieandpip..ne*^^^ 
Pirst, «,e m module and p,^^ --^ P;^^^^ 

are multiple pipeline. ^^ie existing T&L techniques 

pressing by culling vertices be^ 1* "9^ ^ ,,,,,3 a 

t^nsfom, and light ail vertices. Th«)>e 
,5 software-implemented vertex ^^Icache be^«eentheT&L layer and 

r^n— randpractica,.oruseonembeddeddev.es. 

P10.tlsab,ocKdiagrami,,us,ratingagene.ove.^^^^^^ 

" graphics.nde.ngsvstem1«^^^^^^^ 
aisclosedhereiathes^^ OJ^^^^^^^ 

such as a mobile computng "^^^ . J processed rendenng 

«„denng data 120. processesthe data Mja^^^ 

,S aatataosuitabietorrendedngb^^^^^^ 
,enderingdata120typicaliyis.nafloa«ngp 

. p,G 1 .be NHCS graphics rendering system 100 includes a 
AS shown in FIG. 1, the Nno ..„rf,„,APO module 150, and a 
.sKmodule140,an application progran,..*-^ 

30 .rivermoduleteo. ^•'«'^=^'~:'te;ritlnto a desired flxed-poi^ 
,oa«ng.polnt.om,a.andconverts.he data 120 
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120 in a ^^^^"^^^f:^^,^^ data then is sen, to the APi 

c:::ri.'s:tc:r.^e...sto.in.t.eco„.e.eaa 

module 1 50. The An mo „^^anri buffer for the driver module 

to be rendered by a rendering engine. 

exemplary operating environment 200. 

The transform and lighting module and method are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well known computing systems, environments, and/or configurations that may be suitable for use with the transform and lighting module and method include, but are not limited to, personal computers, server computers, hand-held, laptop or mobile computer or communications devices such as cell phones and PDA's, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.

The transform and lighting module and method may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types. The transform and lighting module and method may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

With reference to FIG. 2, an exemplary system for implementing the transform and lighting module and method contained on the NHCS graphics rendering system 100 includes a general-purpose computing device in the form of a computer 210 (the computer 210 is an example of the computing device 110 shown in FIG. 1).

Components of the computer 210 may include, but are not limited to, a processing unit 220, a system memory 230, and a system bus 221 that couples various system components including the system memory to the processing unit 220. The system bus 221 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. By way of example, and not
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computer readable nned,a can be any a^ ^^^^ 

•^^-"^"'^""jirt'oC'-noUl— compter 
and non-removable media. Byway 3^53 gnd communication 



modules or o»er data. 

15 



=a„ includes but is not limited to, RAM, ROM. 
computer storage "-^^^ oD-ROM. digHa, versa«le 

EEPROM. flash memory or other <^'^ tape, 
«s(OVO,orotheroptio.~^^^^^^^^ 

20 ^^--^-^^tm— ionmed,atypicallyemb«.ies 

by the computer 210. uommui ^^H.iles or other data in a 

:«einst.ctions,aaUst.~^-— ^^^^ 

modulated data signal such as a earner wave 
includes any infonration delivery media. 

Notethatthetem,..m.ula.edda.s.na,-^^r.a^^^^ 
^.o,^oharac.eristicssetorchanged.suc.a™^^^^^ 

--r^ruiri^nllrdirect^lredconnecuon, 
media includes wired med« such as aw media. 
30 and wireless media such as acoustic, RF, mfrared 
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C„„.a^s.a„vo..e3.ve.o..a,so..*.ea*n.escopeo, 
computer readable media. 

wo includes computer storage media In the km of 
The system memory 230 ,nc ude^ P ^^^^^ ,3, ^ 

.ola.ileand.rnonvo,a«le— ^^^^^ 

random access memory (RAM) • irfomiation between elements 

containing the basic routines that MP ^ 

RAM 232 typically «>ntams data and/or p g ^^^^^ 
, access,b..oand,orp^entlyhe,ng^^^^^^^^^^^^ 

application programs 235. otnerp a 
20 ROM or other optical media. 

oKi«. volatile/nonvolatile computer storage 

Otnerremova^— 
media that can be used ,n the ^-^^^ ^^3, versatile 

„„.l,mited.o,magnetiotapeca^-;^^^^ 
25 disks, digital video tape, soIk) state RA^ ^^^^^^^ ^ 

harddisKddve241ls.ypica,lyconne<^^^^«J^^^^ 
^movable memory interface such as .nterfa^ m ^ ^ 

andopBcaldiskdriveaSSaretypicallycon-ted^-^^ 
.movable memory interface, such as mterface 250. 



30 
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s^uctures, ^ TL,-^ operating system 244, 

--^'^ *:::r ::r ana - ^ 

ope.«nfl system 234, application P-^^-^^^^ 245, o..e, 

and program data 237. Operating ^^^^^'^ „,„.ers t,ere to 

p^gram modules 246, and program <> ™ ^ ^ ^ ^ 
, Istrate that, at a minimum, they are " » ' ^ ^ as 

oommandsandin,on.a«onin.thecor^P^er^^^^^^ 

a keyboard 262 and pointing device 261 , commo y 



tracl(ball or touch pad. 



15 



20 



p^cessing un* 220 through a user '^^'^^^^^^ ^ ^3 structures, 
system .H.S 221 , but may be connected ^^^^.n^ serial bus (USB), 
such as, for exam^e, a pa*l po ■ 9 - ^„ bus 

.™n«or291orothertypeo,disp.ay-^^^^^^ 

221 Via an inteKace, such as a vKieo — ^ ^^^^ ^^^^^^^ 

computers may also in^udeotherpen^ o^^ ^^^^,^^^^^^^^^^ 
297 and printer 296, which may be connected thro g 



25 interface 295. 



30 



.«h.mrkP(i environment using logical 
The computer 210 may operate ,n 230. 
„„„ect.ns.ooneormoreremotecompu.r^.^^^^^^^^^^ 

The remote computer 280 may be a ^ ,„Cudes 

netwoH.PC.apeerdeviceorothercommon — 

„any or all of the elements descnbed above relative 
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J • « ofti ha<i been illustrated in FIG. 2. The 

and a «kle area netwo* (WAN) 273, but y ^.^^^ „„pu,er 

„e,«orKing environments are commonplace ,n offices, 
networks, intranets and the internet. 

When used ,n a LAN net«orKin, ^'^^'^JZ^^Z' In 
connect, to the LAN 2.1 tnrou,n a ,„,„des a 

"sea in awAN J 

, modem272orott«rn,eans^r^^^^^^^^^^^ 

such as the Internet. 260, or other 

oonnect«.tothesvstemus^j2"^^^^^^^^^ 

app^priate mechanism. Inane^ „-,rtbns thereof, may be stored In the 

depicted rels«ve ^ ^ "^^^^ ^ exam^e, and not«tat.n, PIO. 2 
5 remote memory storage device. ^>'"'^ , „^n«>^ device 281. 

:::r:::-rgacommun,cat^^^ 



20 



be used 

III, nyntfTTr f^^mponents 



25 



30 



,„p,ementationo,theNHOSg,ap.«^^^^^^^^ 

*'-rro:r:;e— .Microsot^C^^^^^^^^^^ 
implemented in a Direcwu n ^^^^ ^ 

Redmond. Washington, develop^^^^^^^ 

rendering standard. Trad,t,ona:iy, D3D suppo ^^^^ 

pe,sonaloompu.er(PC)applications.Th PCsy^^^^^^ 

and GPUs and can support intensive araph cs ^nd- g ^ ^^^^ 

,e.ces,suchasmobile— ^^^^ 

powerful processing units. The NHCS grapn 
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■ »h,«sthe use of D3D on mobile com^ttng devices (D3DM). 
disciosed herein enables the use software-based 
TheNHOSgraphicsrenderingsvs^a— -^^^^^^^^ 

,3nsforn,andiigh«n,n»du,eHa.^;-^^^^^^^^^^ 
corresponding graphics funcSons. ^ ^ resources 

on mobile computing devices and makes efficient use of the resources available on mobile computing devices.

T ;l I aP code provides integra««>n with the ope..ng 
10 straightfoniran). Thus, the An ^^3, 

:::r:r::rrm:::=^^^^ 

The design of D3DM is based on the fact that objects are described in terms of primitives. Primitives are in turn described by vertexes (or vertices). A vertex is a point where two lines meet. The vertex carries a great deal of information. For example, a vertex contains its coordinates and weight. Material color is commonly coded in terms of a diffuse and a specular component in "RGBA" format (for red, green, blue and alpha). A vertex also contains a normal, the vector that is orthogonal to its surface, and texture coordinates. A vertex may have several texture coordinates in case more than one texture is applied to the vertex. Thus, a vertex carries a large amount of information.

D3DM loads this data for the vertexes into vertex buffers. The data then is processed by the transform and lighting (T&L) module



30 
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K The transform and lighting module and method 
oolorvaluesforaframebuffer. ^^^^ "^"2, bv D3DM The mathematScallibiary is 
contain a mathematical library that ,s us^ by D3DM^ ^ 
„sed to implementthetransfom, and ^^'"^l „n„i„g, 
,he features of the mobile computing dev«e on wh«:h 
ftus achieving maximum drawing perfom^nce. 

.efer.ngtoP.O.3,thetas.moduleM0.c^.^^^^^^^^^ 
,„slator300,anapplication305,and«.^^^^^^^^ 

, afixed-pointfom^toraNHCSflxedp „„„.,bra-y arKi translator 

300 converts the data 310 and pe ^ 
•----V:.~v:rar Thepreliminarymathema^ 

, :::rra:r:t:re:e«n.- 



20 



25 



30 



...lmoduletSOc.atesb.e-^^^^^^^^^^ 
preparing the data for the drh,er <^^^'^, venex 
ndex buffer 315, for storing indices, and a ^^^'^^ ^^.^^ ,3 ^ 

„ion. rrexb eachmdexls 

an index, indices are used to relieve av ^^^^ ^^^^^ 

an offset in the current vertex buffer o the d^<o^ ^ 
shaHngo.vertexdatabe*»eenm*leva^c^anda^^^ 
^.,eo,ve.ices v.hen two neigh onn^.^^^^^^^^ 

™- -0 - — ^^^^^^^^^^ The API n»du. 150 

— «'"^'^'":r:;plIgesthecommands325andp.vides 

:r;rr::::-se:..or.e.^^^^^^ 
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for the driver module 160. A command buffer 
:::r,^rm.ein.sen...ed.ermcdu.1S0. 
340 stores the wrapper 3.it) pno^ 

* th<.rfl<?ter In addition, the driver 
The driver module 160 prepares aatafo^-^-^^ 

„odu,e160prepa,.«.eda.forusebva.n«^^^ 

data Is translated Into the ^^^ l^^^^J,,, ar.er modute 160 
,3.v.are and causes pa.cular pnm^--^^^^^^ 



raster 
W. 



, ,„ general, the transform a .^^^ 

.ranched pipeline thattaKes '-^'^^ ^ .^^ ,,eamllne branched 
output data containing 2D screen »c*a^ ^ ^ CPUs 

.pellnelspar.lculariy^e-sue'o'.^^^^^^^^^^ 
contained on embedded devices. The ^ng <^„p„,atlonal time 

,0 archnectureofthetransformandl^-^;^^^^^^ „^ 
.„ag.*inc.asese.f.c,enc.Trans^^^^^^^^ 

and other systems having ^^'^^Z processing using a low-power CPa 
single streamline branched P'P*^J ^ „„,„p,e streamlines) and 

W.aOPU,mprocessln«.s^VP— 
25 branches are not necessary, in fact, bra 
down the processing and computabon. 
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aa«400hasbeenp.evious.y.ran— . Hhev^rte^^^^^ 

vertex informatton has not been prevK, y ,^„sfom,ation 
420 perfoms me transfom,at«>n on the vertex »i ^^^^ 
^„,e«0.— thevertex*— ^^^^^^ 
This transfonnation is performed usmg NHCS 
detailed l)elow. 

The vertex cache 415 is an important part of the transform and lighting module 345. The vertex cache 415 facilitates the single streamline branched pipeline architecture by allowing vertex data to be stored while it is being processed. In addition, the vertex cache 415 provides storage for vertex information that has already been processed by a certain module. Direct3D mobile (D3DM) is implemented in the NHCS graphics rendering system 100 and the vertex cache 415 is software implemented, whereas traditional Direct3D implementations use hardware. The software-implemented vertex cache 415 saves memory and processing power.

A culling module 425 is used to perform culling of the transformed vertex information prior to the information being processed by the lighting module 430. Placing the culling module 425 before the lighting module 430 eliminates unnecessary (and computationally expensive) processing by the lighting module 430. This placement reduces the number of vertices handled by the lighting module and saves both time and processing power.



30 
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400, ..s in— was s.ored ^J^^'^ ^^^^^ 



module 420. 



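As explained in detail below, some vertices may be discarded by the culling module 425, so that the lighting module 430 processes only the remaining vertices. After the vertex is lit, it is passed to a texture generation and transformation module 440. The texture generation and transformation module 440 computes texture coordinates and transforms the texture coordinates after generation.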

,.^..no.edt.a...e.acemento..e^o.^^^^^^^^ 

,,„„orm and lighting pip^ine ,„ „e transfonn and 

25 lighting and after texture generatron and ^ ^ 

„h.ing pipeline o, disposed ^^^'^^^ 430 and before the texture 

03DM and Is pos«loned betb-e the ligbt^ ^^^^ 

generation and transformation module 44a msp ^ 
Uan^~mputa«onaltlmeb.ause^^^^^^^^^ 
30 computationally intensive part of the transfo 
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--^^^°'*^^t»Xn'-a«e.*«ng and .e.ure generate. 
View frustum clipping is applied after lighting and texture generation and transformation because it involves interpolation of the color and the texture coordinates. The view frustum clipping module 450 uses NHCS fixed-point operations. The data exiting the view frustum clipping module 450 is divided into vertex information and indices. The output data of the transform and lighting module 345 is output for display on a screen or monitor of an embedded device.

• . of SinateStreaoyneBS^^ 
The transform and lighting module 345 includes a single streamline branched pipeline that is optimized for use on embedded platforms. FIG. 5 is a general flow diagram illustrating the operation of the single streamline branched pipeline (or T&L pipeline) of the transform and lighting module 345 shown in FIG. 4. The pipeline begins by inputting rendering data in model space (box 500). The rendering data includes a plurality of vertices. Next, the rendering data is transformed from model space into clip space (box 510). Each vertex then is examined to determine whether it should be culled (box 520). The culling process is described in detail below with reference to the culling module 425.



30 



•.i.rti.rarded OthenMse, processing is 
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needed (box 540). ^^l' ^ ,3che to store data, the 

architecture o. the P'P*-^^;' 'por example, « input rendering data has 

processing by the lighting module. 

coordinate 'V^tem (NHCS) clip coo ,,,^.poi„tfonhat allows 

.,..e.enera«ona«^^^^^^^ 

computat^ns ^'^ ^'"^^ ^ is ..ncated. This 

'^"^^^^"XrMhrcSflxed.poi,*tor,.ata„ows™^^ 
3 r:::::::o:tap.oessin.power. .eNHOS.ed.poiot,cr.at,s 

described in more detail below. 

module 425 shown m FIG. 4. "^"^ .gMing module 430. 

of transfom^ed vertex informa*on before P'"^^^, ,„ me transform and 
Theplaoemen.o,thecu..,.^n»du.e42 .sin^r^ tjn^^^^^^^ 

,,ghtingpipeiineaisoiosedherein,«.ecu.ng^^^^^^^ 

Referring to FIG. 6, the culling process of the culling module 425 begins with the first type of culling, which is back face culling. Back face culling checks whether the vertex forms a back face of a triangle (box 620). If not, the vertex being examined is kept (box 630). If the vertex forms a back face of a triangle, then the vertex is discarded (box 640).

The second type of culling is view frustum culling. View frustum culling checks whether the vertex is outside of a view frustum clip plane.
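The view frustum test described above can be sketched as a per-vertex bit mask. This is a minimal illustration of the standard outcode technique, not the module's actual code: the `ClipVertex` type, the `CULL_*` names, and the Direct3D-style clip volume (-w <= x, y <= w and 0 <= z <= w) are assumptions, and all four components are assumed to share one fixed-point scale.

```c
#include <stdint.h>

/* One bit per frustum plane (six planes, six flags). Names are hypothetical. */
enum {
    CULL_LEFT = 1 << 0, CULL_RIGHT = 1 << 1, CULL_BOTTOM = 1 << 2,
    CULL_TOP  = 1 << 3, CULL_NEAR  = 1 << 4, CULL_FAR    = 1 << 5
};

/* Clip-space vertex; x, y, z and w are assumed to use the same scale. */
typedef struct { int32_t x, y, z, w; } ClipVertex;

/* Set a flag for every clip plane the vertex lies outside of. */
static unsigned cull_flags(const ClipVertex *v)
{
    unsigned f = 0;
    if (v->x < -v->w) f |= CULL_LEFT;
    if (v->x >  v->w) f |= CULL_RIGHT;
    if (v->y < -v->w) f |= CULL_BOTTOM;
    if (v->y >  v->w) f |= CULL_TOP;
    if (v->z <  0)    f |= CULL_NEAR;
    if (v->z >  v->w) f |= CULL_FAR;
    return f;
}

/* A triangle can be rejected when all three vertices are outside the
 * same plane, i.e. the AND of their flags is non-zero. */
static int triangle_outside(const ClipVertex *a, const ClipVertex *b,
                            const ClipVertex *c)
{
    return (cull_flags(a) & cull_flags(b) & cull_flags(c)) != 0;
}
```

Only integer comparisons are needed, which suits a CPU without floating-point hardware; the flags can also be kept with the vertex so that the test is not repeated.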

The transform and lighting module and pipeline disclosed herein use a normalized homogenous coordinate system (NHCS) to perform operations on the rendering data. NHCS is a high-resolution variation of fixed-point number representation. In general, fixed-point representation is a way to represent a floating-point number using integers. Representing a number in a floating-point representation means that the decimal does not remain in a fixed position. Instead, the decimal "floats".



Other hardware limitations. 



25 



30 



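An alternative is to use fixed-point operations, which are executed using integer functions. On embedded platforms, the CPU may not be powerful enough to support floating-point operations, and there typically are no coprocessors for accelerating floating-point computation.

A fixed-point number representation is an alternative to one that uses floating point. Typically, some of the bits are used for the whole part of the number and the remaining bits for the fractional part. For example, when 32 bits are available, a 16.16 configuration means that there are 16 bits before the decimal (representing the whole part of the number) and 16 bits after the decimal (representing the fractional part of the number). In this case, 65535.99998474121 is the largest possible number. This is obtained by setting both the whole and the fractional portions to all ones (in binary): 65535 is obtained for the whole part with 16 bits, and if 65535 is divided by 65536, then the value .99998474121 is obtained for the fractional part. There are other variants such as 24.8 (24 bits before the decimal and 8 bits after) and 8.24 (8 bits before the decimal and 24 bits after). The configuration type depends on the amount of precision that an application needs.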
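The 16.16 arithmetic above can be checked with a small sketch. The type and function names (`ufix16_16`, `ufix_from_double`, and so on) are hypothetical and chosen for illustration; the sketch treats the 32 bits as unsigned, which is the case the figure 65535.99998474121 corresponds to.

```c
#include <stdint.h>

/* Unsigned 16.16 fixed point: 16 whole bits, 16 fractional bits. */
typedef uint32_t ufix16_16;

#define FRAC_BITS 16

/* Convert a non-negative double to 16.16, truncating extra precision. */
static ufix16_16 ufix_from_double(double x)
{
    return (ufix16_16)(x * (double)(1u << FRAC_BITS));
}

/* Convert a 16.16 value back to double. */
static double ufix_to_double(ufix16_16 f)
{
    return (double)f / (double)(1u << FRAC_BITS);
}

/* Whole part: the upper 16 bits. */
static uint32_t ufix_whole(ufix16_16 f) { return f >> FRAC_BITS; }

/* Fractional part as a count of 1/65536 steps: the lower 16 bits. */
static uint32_t ufix_frac(ufix16_16 f) { return f & 0xFFFFu; }
```

With all 32 bits set, `ufix_whole` yields 65535 and `ufix_frac` yields 65535, so the value is 65535 + 65535/65536. A 24.8 or 8.24 layout would follow from changing `FRAC_BITS` to 8 or 24.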

.ane.em.a.embo.imento,theop«ed.^^^^^^^^^^ 

.odule and pipeline, DirectSD for n»blle po« 
, operateintheOaOMtrans..— ^'^^ 

numbers need to be converted 0 NHCS « ^^^^^^ ^^^^ 

conversion is easy as possible (so that the rang 
need to be known) while preserving the precsion of the data, 
number representation achieves these objectives. 

NHCS is a type of vertex representation. NHCS can eliminate the annoying overflow, and provides a wider data space. For example, without NHCS, the range of the model space vertex coordinates must be assumed in advance. By using NHCS, both range and precision are greatly increased.

NHCS also makes the conversion from floating-point to fixed-point easy. It is not necessary to know the exact range of the input vertices. NHCS also preserves all transform and lighting operations and makes use of the "w" in homogeneous coordinate representation.
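One way to picture a normalized fixed-point vector is a shared power-of-two scale: every component is divided by the same power of two until the largest magnitude falls inside (-1, 1), and the components are then stored with 30 fractional bits. The sketch below is an assumption-laden illustration of that idea (similar to block floating point), not the normalization rule of this document, which is not recoverable from the text here. The function name and `ONE_BITS` are hypothetical, and inputs are assumed to be finite.

```c
#include <stdint.h>

#define ONE_BITS 30 /* fractional bits for values inside (-1, 1) */

/* Scale a 4-vector by a shared power of two so that every component lies
 * in (-1, 1), then store each component with ONE_BITS fractional bits.
 * Returns e such that original = (out / 2^ONE_BITS) * 2^e.
 * This sketch only scales down; small vectors are left at e = 0. */
static int to_shared_exponent(const float in[4], int32_t out[4])
{
    float max = 0.0f;
    for (int i = 0; i < 4; ++i) {
        float a = in[i] < 0.0f ? -in[i] : in[i];
        if (a > max) max = a;
    }

    int e = 0;
    float scale = 1.0f;
    while (max * scale >= 1.0f) { /* halve until the maximum is below 1 */
        scale *= 0.5f;
        ++e;
    }

    for (int i = 0; i < 4; ++i)
        out[i] = (int32_t)(in[i] * scale * (float)(1 << ONE_BITS));
    return e;
}
```

Because all four components share one scale, the ratios between x, y, z and w are preserved, which is what homogeneous coordinates require; the caller does not need to know the range of the input in advance.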

> no. / lo ^^^^niu The transform and lignting 

™dule and pipeUne shown ,n RG. 7 >ndud^ a ^„^^^„ ^ 708. I. 

° rr r ir^- . . .p— „ . o... 

(D3DM) rendering standard for embedded devices. 

In this working example, a flexible vertex format (FVF) is used so that only the necessary components of the vertex are selected. The following vertex structures are supported:

typedef struct t_FVF {
    BOOL bFog;       // Whether Fog component exists. Only ...
    BOOL bDiff;      // Whether Diffuse component exists.
    BOOL bSpec;      // Whether Specular component exists.
    BOOL bXYZ;       // Whether Coordinate component exists.
    BOOL bNorm;      // Whether Normal component exists.
    int nTexNum;     // Number of textures.
    int nTexCoord;   // Number of texture coordinates.
    int nSize;       // Total size of a vertex.

    // Offsets in a vertex
    int offFog;      // Offset of the Fog component.
    int offDiff;     // Offset of the Diffuse component.
    int offSpec;     // Offset of the Specular component.
    int offXYZ;      // Offset of the coordinates component.
    int offNorm;     // Offset of the Normal component.
    int offTex;      // Offset of the Texture coordinates component.
} FVF;
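To show how such a descriptor might be filled in, the sketch below lays the selected components out back to back and records each offset. The component order, the byte sizes in the `SZ_*` constants, the use of -1 for absent components, and the field spellings `offXYZ`, `offNorm` and `offTex` are all assumptions made for illustration; none of them is stated in the text.

```c
typedef int BOOL;

/* Same fields as the FVF structure; some spellings are assumed. */
typedef struct t_FVF {
    BOOL bFog, bDiff, bSpec, bXYZ, bNorm;
    int nTexNum, nTexCoord, nSize;
    int offFog, offDiff, offSpec, offXYZ, offNorm, offTex;
} FVF;

/* Hypothetical component sizes in bytes (not from the source). */
enum { SZ_XYZ = 12, SZ_NORM = 12, SZ_COLOR = 4, SZ_FOG = 4, SZ_TEXCOORD = 4 };

/* Pack the present components in a fixed, assumed order and record
 * where each one starts; absent components get offset -1. */
static void fvf_layout(FVF *f)
{
    int off = 0;
    f->offXYZ  = f->bXYZ  ? off : -1; if (f->bXYZ)  off += SZ_XYZ;
    f->offNorm = f->bNorm ? off : -1; if (f->bNorm) off += SZ_NORM;
    f->offDiff = f->bDiff ? off : -1; if (f->bDiff) off += SZ_COLOR;
    f->offSpec = f->bSpec ? off : -1; if (f->bSpec) off += SZ_COLOR;
    f->offFog  = f->bFog  ? off : -1; if (f->bFog)  off += SZ_FOG;
    f->offTex  = f->nTexCoord > 0 ? off : -1;
    off += f->nTexCoord * SZ_TEXCOORD;
    f->nSize = off; /* total size of one vertex */
}
```

Storing the offsets once per format lets the pipeline read a component with a single pointer addition per vertex instead of re-deriving the layout each time.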



5 



10 



its offset for memory access. 

.pspaoe,-aNHCS«xea.pc»n..— 

pipeline will chaoK a vertex " ace NHCS vertex v«« be 

..nstom-e. before. ^^^ij^l^jl vertex. The .natHces and 

transformed by a matrix, M^to a NHl, P ^^^^^ 

,un*ns discussed in «.is «o,King --P'-'! „ry 
,etransfonnedda«lss.redinmeve^xc.^e7^^.S^^ 

.essa,esarealsos.oredin.Hevertex.^^^^^^^ 

— :re:=^^^ 
, ::;Ve:=^^^^^ 

..o.b.bepl.iineo....e*s...^^^^^^ 

are stil, useful for e«her .-^bSnB in v^ spa» 
.erearefwowavsfo.o-M— J^^^^ 
3pace.ov.wsp.e M^^^^^^^^ 

space by Mp ; and (2) transtorm ^^^^^^ 

rrT:rr:urnedo.,,.ebvpas.^^ 

30 does not affect the pipeline. 
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u m FIG 7 the transformation module uses the function 
non-NHCScUpspaoev^-^^^^ 

the culling is part of the transtotma ^^^^ 

env«(1 and View Frustum Cull 728 are useuiu 
Bacl(face_SFiX32() ana v le 

10 to the lighting module 704 to De m. » 

fogging on the vertex infotmation, if desired, 

The lighting module 704 computes color according to lighting parameters for vertices that are not on back faces. A vertex can be input either with a normal or with diffuse/specular color. However, only one of the two can be assigned; both cannot be assigned.

„,„„^,.lnpu«ed,thelightinglscalcul..^^^^^^^ 
Phongmod^T^eoutputcontalnsdiJuse.^-^^^^^ 

r:rr::.--ormto..^^^ 

incorrecti-^hang resuKs. Lighting m v,ew ^P^^^^^^^^^ , .p,,, 
^dewv^rid matrtx could be scaled asyn^metr^a ' t^ ren^ 
30 n»delspacein.oane,.ip.iain«.-J-^^^^ 
incorrectly when lit in model space. ThB occurs 




involved. 

HM.P 704 uses the following functions. The function 
The lighting mod 7^ ^^^^^^ ,,,t/view 

SumNorm_SFlX32QuadO 740 .s used information is 

,nput for the function Dot_SFlX1 6Tnp e() , ^^^^ 
„o™al.e.veoto..T.eaotp^u.^^^^^^^^ 

RGB color values are computed. Tn ^^cs fixed-point fom.at 

^..nsformfron, n^e, space tov,ewspacus^^^^^^ 

The out^t is a posH^n in view ^P«- ~ ,^3,,. output is a 
used to transfon. a normai from <^ ''^J^ J,,, ^ „gnUng 
„o,.a,inviewspace.a.er,u^o^^^^^^^^ 

.ete.re.enerat.onand,ransforma.nn^u^^^^^^^^ 
coordinates and the coordinate trans.om,s. n e^- ~ „ 

^ Flag of view space nonnal 
^ Flag of position 

:rrr::icteda«erte..recoo.nate.nera«on. 

.ete.ure.enerationandt— ationn^u.™-^ 

„.Portexture,eneration>t^— -^^^ 

TransQuad.SFlX32() 760 ,s used to sFiX32QuadO 764 is used 
usingaNHCS fixed-point fomiat. second, SubNorm. 



30 
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TransNorm.SFlXie 0 ^^^^ ,3„„,3.e reflection from «« 

TransQuad.SFlX160 to transform the texture coOrd,nates. 

The vertex cache 716 contains the intermediate transform and lighting results for reducing calculation while rendering a single frame. The vertex cache 716 is reset when rendering of a frame is completed. In this working example, the vertex cache holds 32 vertices. It is defined as

#define CACHESIZE 32

Each element in the vertex cache 716 is defined as:



15 



typedef struct t_VertexCacheItem {
    UFIX8 flag;
    BYTE* pVtx;
    WORD idxDestVtx;
    int shift;
    SFIX32 w;
    BYTE* pTnlVtx;
    SFIX32Quad epos;
} VertexCacheItem;
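How a 32-entry software cache can avoid repeated transform and lighting work is sketched below. The direct-mapped lookup (index modulo the cache size), the `CacheSlot` type, and the bit positions of the transformed and lit flags are assumptions made for illustration; the text only states that the flag byte holds six culling bits plus one bit each for transformed and lit.

```c
#include <stdint.h>

#define CACHESIZE 32

/* Assumed bit positions: bits 0-5 for culling, 6 and 7 for state. */
enum { FLAG_TRANSFORMED = 1 << 6, FLAG_LIT = 1 << 7 };

/* Simplified cache slot (hypothetical; the real item holds more data). */
typedef struct {
    uint8_t  flag;     /* culling bits plus transformed/lit bits */
    uint16_t srcIndex; /* index of the source vertex held here */
    int      valid;    /* non-zero once the slot is in use */
} CacheSlot;

typedef struct { CacheSlot slot[CACHESIZE]; } VertexCache;

/* Invalidate every slot; done when a frame has been rendered. */
static void cache_reset(VertexCache *c)
{
    for (int i = 0; i < CACHESIZE; ++i)
        c->slot[i].valid = 0;
}

/* Return the slot for a vertex index, evicting a different vertex
 * that happens to map to the same slot. */
static CacheSlot *cache_lookup(VertexCache *c, uint16_t index)
{
    CacheSlot *s = &c->slot[index % CACHESIZE];
    if (!s->valid || s->srcIndex != index) {
        s->valid = 1;
        s->srcIndex = index;
        s->flag = 0; /* nothing computed yet for this vertex */
    }
    return s;
}

static int needs_transform(const CacheSlot *s) { return !(s->flag & FLAG_TRANSFORMED); }
static int needs_lighting(const CacheSlot *s)  { return !(s->flag & FLAG_LIT); }
```

A vertex shared by neighboring triangles is transformed on its first lookup and reused afterwards, so each pipeline stage runs at most once per cached vertex.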




♦ flag contains 6 bits of culling flags, 1 bit for transformed and 1 bit for lit.

if the vertex is clipping by view frustum. 
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.s««>s«,e«...orrec„v..e.ea,win,e«,n«nc.NHCSo,lpspace 

nnrtPd in this transform andlighting pipeline, 
vertices are supported in this ^^g^^,y 

coordinate in pTnlVtx but ,t ,s non-NHCS. 
again. 



necessary. 



25 



30 



, also is performed in the transform and lighting 

View frustum clipping 780 also IS pe ^^^^inates The 
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the following tables: 



10 



15 



♦ SFIX64: signed 64-bit integer
♦ UFIX64: unsigned 64-bit integer
♦ SFIX32: signed 32-bit integer
♦ UFIX32: unsigned 32-bit integer
♦ SFIX16: signed 16-bit integer
♦ UFIX16: unsigned 16-bit integer
♦ SFIX8: signed 8-bit integer
♦ UFIX8: unsigned 8-bit integer
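On a compiler that provides `<stdint.h>`, the basic types listed above could be declared as follows. This mapping is an assumption for illustration; the original platform may have used compiler-specific integer types instead.

```c
#include <stdint.h>

/* Fixed-point storage types; the interpretation (how many bits are
 * mantissa) is carried separately by the macros that use them. */
typedef int64_t  SFIX64; /* signed 64-bit integer   */
typedef uint64_t UFIX64; /* unsigned 64-bit integer */
typedef int32_t  SFIX32; /* signed 32-bit integer   */
typedef uint32_t UFIX32; /* unsigned 32-bit integer */
typedef int16_t  SFIX16; /* signed 16-bit integer   */
typedef uint16_t UFIX16; /* unsigned 16-bit integer */
typedef int8_t   SFIX8;  /* signed 8-bit integer    */
typedef uint8_t  UFIX8;  /* unsigned 8-bit integer  */
```

Exact-width types keep the bit counts independent of the target CPU, which matters because every shift amount in the fixed-point macros assumes a specific width.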



Structure types

♦ typedef SFIX64 SFIX64Quad[4]
This data structure is used to store a 4-element vector, and each element is a 64-bit signed integer. This vector can be either NHCS or non-NHCS.

♦ typedef SFIX64 SFIX64Triple[3]
This data structure is used to store a 3-element vector, and each element is a 64-bit signed integer. This vector can be either NHCS or non-NHCS.

♦ typedef SFIX32 SFIX32Quad[4]
This data structure is used to store a 4-element vector, and each element is a 32-bit signed integer. This vector can be either NHCS or non-NHCS.

♦ typedef SFIX32 SFIX32Triple[3]
This data structure is used to store a 3-element vector, and each element is a 32-bit signed integer. This vector can be either NHCS or non-NHCS.

♦ typedef SFIX16 SFIX16Quad[4]
This data structure is used to store a 4-element vector, and each element is a 16-bit signed integer. This vector can be either NHCS or non-NHCS.

♦ typedef SFIX16 SFIX16Triple[3]
This data structure is used to store a 3-element vector, and each element is a 16-bit signed integer. This vector can be either NHCS or non-NHCS.

♦ typedef UFIX8 UFIX8Quad[4]
This data structure is used to store a 4-element vector, and each element is an 8-bit unsigned integer. This vector is non-NHCS. It is mainly for representing color RGBA components.

♦ typedef SFIX32 SFIX32Mat4x4[16]
This data structure is used to store a 16-element matrix, which is 4 by 4. Each element of the matrix is a 32-bit signed integer. This matrix can be either NHCS or non-NHCS.

The mantissa bits listed here are for fixed-point data representation:

♦ #define DEFAULT_SFIX32   16   // default mantissa bits for 32-bit signed
♦ #define ONE_SFIX32       30   // mantissa bits for 32-bit signed
♦ #define NORMAL_SFIX16    14   // normal mantissa bits for 16-bit signed
♦ #define TEXTURE_SFIX16   12   // mantissa bits for 16-bit texture coordinate
♦ #define ONE_UFIX16       15   // mantissa bits for 16-bit unsigned within (0-1)
♦ #define COLOR_UFIX16      8   // color mantissa bits for 16-bit unsigned



The constants listed here are for integer shifting during computation and conversion between different data formats:

♦ constSHX32 ^^'^''''''''^'Zl-^^- 

«uc eciX32 i=(SFIX32)1«ONE_SFIX3Z, 

♦ const SFIX32 °NE.8F X32J (SF I ^ ^^^^^^^^ 

♦ oonstSFIXie --^^ ^^^^^^^^ 

" : N0rL=N0RMA..SF.Xie-TEXrUR..SFIX16., 

The basic operations f.ve the ^^'^ — ' 



20 



The following macros are conversion macros for converting between different data formats:

♦ #define PosToTex(a)          ((SFIX16)((a)>>POSTOTEX))
♦ #define NormToTex(a)         ((SFIX16)((a)>>NORMTOTEX))
♦ #define FloatToSFIX32(a,n)   ((SFIX32)((a)*((SFIX32)1<<(n))))
♦ #define SFIX32ToFloat(a,n)   ((float)(a)/((SFIX32)1<<(n)))
♦ #define FloatToSFIX16(a,n)   ((SFIX16)((a)*((SFIX16)1<<(n))))
♦ #define FloatToUFIX16(a,n)   ((UFIX16)((a)*((UFIX16)1<<(n))))
♦ #define SFIX16ToFloat(a,n)   ((float)(a)/((SFIX16)1<<(n)))
♦ #define FloatToUFIX8(a)      ((UFIX8)((a)*255))

The following macros are computation macros for computing between fixed-point data:

♦ #define Mul_SFIX32(a,b,n)    ((SFIX32)(((SFIX64)(a)*(b))>>(n)))
♦ #define Mul_UFIX32(a,b,n)    ((UFIX32)(((UFIX64)(a)*(b))>>(n)))
♦ #define Div_SFIX32(a,b,n)    ((SFIX32)(((SFIX64)(a)<<(n))/(b)))
♦ #define Mul_SFIX16(a,b,n)    ((SFIX16)(((SFIX32)(a)*(b))>>(n)))
♦ #define Mul_UFIX16(a,b,n)    ((UFIX16)(((UFIX32)(a)*(b))>>(n)))
♦ #define Mul_UFIX8(a,b,n)     (((UFIX16)(a)*(b))>>(n))
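A short sketch of how the conversion and computation macros fit together is given below. The macro bodies follow the definitions above; the `<stdint.h>` typedefs and the helper `scale_16_16` are assumptions added so that the fragment compiles on its own.

```c
#include <stdint.h>

typedef int32_t SFIX32;
typedef int64_t SFIX64;

#define FloatToSFIX32(a,n) ((SFIX32)((a)*((SFIX32)1<<(n))))
#define SFIX32ToFloat(a,n) ((float)(a)/((SFIX32)1<<(n)))
/* Widen to 64 bits before multiplying so the product cannot overflow,
 * then shift the extra n mantissa bits back out. */
#define Mul_SFIX32(a,b,n)  ((SFIX32)(((SFIX64)(a)*(b))>>(n)))
/* Pre-shift the dividend by n so the quotient keeps n mantissa bits. */
#define Div_SFIX32(a,b,n)  ((SFIX32)(((SFIX64)(a)<<(n))/(b)))

/* Hypothetical helper: multiply two values that both carry 16 mantissa bits. */
static SFIX32 scale_16_16(SFIX32 x, SFIX32 s)
{
    return Mul_SFIX32(x, s, 16);
}
```

For example, 1.5 and 2.0 with 16 mantissa bits are 98304 and 131072; their 64-bit product shifted right by 16 is 196608, which is 3.0 in the same format. Shifting negative signed values is implementation-defined (right shift) or undefined (left shift) in ISO C, so a portable version would need extra care for negative operands.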



Input data

Output data

Name                                      Type                   Mantissa bits
Transformed vertex coordinates (x,y,z)    SFIX32                 ONE_SFIX32
Transformed vertex coordinates (w)        SFIX32                 DEFAULT_SFIX32
Color                                     DWORD with A8R8G8B8    -
Texture coordinates                       SFIX16                 TEXTURE_SFIX16
Fog                                       SFIX32                 DEFAULT_SFIX32

All the intermediate data's type and mantissa bits are listed within each function.



Details of each of the above data types are listed below. The reasons why these data types and mantissa bits were chosen are explained.



10 



Lighting

Position/Direction

Light position or direction is taken as:

U^hT^SiittoTF^^ 



15 



NHCS 



This representation provides enough range and precision, and no extra cost exists compared with a traditional representation such as non-NHCS.



Viewpoint 

Viewpoint is represented as:
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NHCS. 



Lighting color

Lighting color includes:

♦ Ambient
♦ Diffuse
♦ Specular

Their representation is:

This representation is a natural expansion of color in D3D in A8R8G8B8 style.
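For reference, the A8R8G8B8 layout packs four 8-bit channels into one 32-bit value with alpha in the top byte. The sketch below shows that packing from floating-point channels in the range 0 to 1; the function name `pack_a8r8g8b8` is hypothetical, and `FloatToUFIX8` is taken from the conversion macros of this document (it truncates rather than rounds).

```c
#include <stdint.h>

typedef uint8_t UFIX8;

/* Scale a channel in [0, 1] to 0..255, truncating the fraction. */
#define FloatToUFIX8(a) ((UFIX8)((a)*255))

/* Pack alpha, red, green and blue into a 32-bit A8R8G8B8 value. */
static uint32_t pack_a8r8g8b8(float a, float r, float g, float b)
{
    return ((uint32_t)FloatToUFIX8(a) << 24) |
           ((uint32_t)FloatToUFIX8(r) << 16) |
           ((uint32_t)FloatToUFIX8(g) << 8)  |
            (uint32_t)FloatToUFIX8(b);
}
```

During lighting, each channel would typically be held in a wider fixed-point type so that sums of several light contributions do not overflow before the final 8-bit value is produced; that intermediate width is not shown here.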



15 



Material property

Material color includes:

♦ Ambient
♦ Diffuse
♦ Specular

Each of them is represented as:

This representation is a natural expansion of color in D3D in A8R8G8B8 style. The power component is represented as:

Power component    UFIX8    No mantissa

In one embodiment of the NHCS graphics rendering system 100, the power is assumed to be an integer from 0 to 127.



Normal

Normal is taken as:

Normal    SFIX16    NORMAL_SFIX16

From empirical evidence, it is concluded that a 16-bit normal is sufficient. Part of the value is reserved as an integer part, for normal coordinates like 1.0 or -1.0.

Texture coordinate

Texture coordinate is represented as:

Texture coordinate    SFIX16    TEXTURE_SFIX16

In a preferred embodiment, TEXTURE_SFIX16 is equal to 12. Further, there is 1 bit for the sign and 3 bits for the integer part. This provides sub-pixel resolution.

Output vertex coordinate

The output vertex coordinate is suitable for a vertex shader. The representation is:

SFIX32    ONE_SFIX32 mantissa

The values of x and y will be within (-1, 1), which is why ONE_SFIX32 is given as 30 and does not suffer from overflow. The w component is not normalized in (-1, 1); a 16-bit fraction and a 16-bit integer give a good balance between the precision and range of w.



Matrices

Prior to rendering, several matrices should be ready.

Model space to world space

Mw: Transform matrix from model space to world space.

Currently, a D3DM implementation assumes that the last column of this matrix is (0,0,0,1)^T. No error is returned, and if a user specifies a matrix with a different last column, the texture coordinate and fog will be incorrect.

World space to view space

Mv: Transform matrix from world space to view space.

Currently, a D3DM implementation assumes that the last column of this matrix is (0,0,0,1)^T. No error is returned, and if a user specifies a matrix with a different last column, the texture coordinate and fog will be incorrect.

Mp: Projection matrix from view space to clip space.

Currently, a D3DM implementation assumes that the last column should be (0,0,1,0)^T to give a correct w value. This is called a w-friendly projection matrix.

Model space to view space

Mwv: Matrix combination from model space to view space.

Model space to clip space

Mwvp: Matrix combination from model space to clip space.

A D3DM implementation combines the matrices Mw, Mv and Mp. The last column of this matrix is determined by the parameters of these matrices. No error is returned.



The mathematical library includes mathematical operations and graphics functions. The mathematical library now will be discussed in detail.

Feature division

The features of the mathematical library are divided into features that are supported by the rasterizer, features supported in resource management, and features supported by transform and lighting (T&L). The mathematical library implements all features supported by T&L.

Features Supported in the Rasterizer
The following features are features in the mathematical library that are 
supported by the rasterizer: 

♦ Point, line list, line strip, tri list, tri strip and tri fan rendering
♦ Point, wireframe, solid fill
♦ Flat and Gouraud shading
♦ Depth test with various compare modes and pixel rejection
♦ Stencil compare and pixel rejection
♦ Depth buffer-less rendering is supported as well
♦ W buffer support
♦ MipMap textures are supported (Interpolate)
♦ 8 stage multi-texture with D3D8 fixed function blending options
♦ Point, linear, anisotropic, cubic and Gaussian cubic texture filtering
♦ Alpha blending (with several blend modes)
♦ Palletized textures
♦ Perspective correct texturing (not on by default)
♦ Color channel masking (COLORWRITEENABLE)
♦ Dithering
♦ Multisampling for FSAA
♦ Texture address modes

Features Supported in Resource Management

Resources are objects that are resident in memory, such as textures, vertex buffers, index buffers and render surfaces. Resource management is the management of the various memory operations on these objects. The following features are features in the mathematical library that are supported in resource management:

♦ Swap chain creation and management for display
♦ Depth/stencil buffer creation and management
♦ Vertex buffer creation and management
♦ Index buffer creation and management
♦ Texture map creation and management
♦ Many texture formats including DXT compressed texture
♦ Scratch surface creation/management for texture upload
♦ MipMap textures are supported (Build)
♦ Dirty rectangular texture update mechanism
♦ All buffers lockable (assuming driver support)

Features Supported in T&L

The following features are features in the mathematical library that are supported in T&L:

♦ Texture coordinate generation
♦ View, projection and world transform matrices
♦ Single transform matrix per texture coordinate set (8 sets max)
♦ Up to 4 dimensions per texture coordinate set
♦ Ambient/diffuse/specular lighting and materials
♦ Directional and point lights
♦ Back face culling
♦ Fog (depth and table based)



Math Functions Indexed by Features

In this section the mathematical functions indexed by features are described. The functions cover transform, culling, lighting, texture and other miscellaneous functions. In addition, the overflow and underflow (resolution loss) problems of these functions are discussed.



NHCS vector transform

This function transforms a vector into NHCS vector c by matrix m.

Parameters

Input vector in NHCS format.

m    Transform matrix in SFIX32Mat4x4 and DEFAULT_SFIX32 format.

c    Output vector after transform, in SFIX32 format in NHCS.

Return value

Shift bits from 64-bits c to 32-bits NHCS c.

Remarks

♦ Overflow:
The maximum possible intermediate value is: 4 × (0x8000 0000 × 0x8000 0000) = 0x1 0000 0000 0000 0000. This indicates that a 64-bits intermediate value will have overflow in the intermediate data before NHCS.
♦ Underflow:
Appears when the 64-bits intermediate value is truncated.

Matrix combination

… SFIX32Mat4x4 m3, UFIX8 n)

Parameters

m1, m2    Input matrices in SFIX32Mat4x4.

n    Input shift bits for shifting the 64-bits multiplication results to 32-bits results.

m3    Output combined matrix.

Return value

No return value.

Remarks

♦ Shift:
The matrices m1, m2 and m3 can have different mantissa bits. Suppose m1 has an a-bits mantissa and m2 has a b-bits mantissa. To get a c-bits mantissa m3, n should be set to n = (a+b)-c.
♦ Overflow:
The maximum possible intermediate value is: 4 × (0x8000 0000 × 0x8000 0000) = 0x1 0000 0000 0000 0000. This indicates that a 64-bits intermediate value will have overflow in the intermediate data. When truncating the 64-bits intermediate result to 32-bits output, overflow is also possible.
♦ Underflow:
Appears when the 64-bits intermediate value is truncated.
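The shift rule n = (a+b)-c can be illustrated with a minimal sketch of a 4x4 fixed-point matrix combination. The function names combine_mat_sketch and shift_for are hypothetical, since the original function name is not legible here, and the row-by-column ordering and the typedef widths are assumptions.

```c
#include <stdint.h>

typedef int32_t SFIX32;
typedef int64_t SFIX64;
typedef uint8_t UFIX8;
typedef SFIX32 SFIX32Mat4x4[4][4];

/* Shift needed so that (a-bit mantissa) x (b-bit mantissa) yields a
 * c-bit mantissa: n = (a + b) - c. */
static UFIX8 shift_for(int a, int b, int c)
{
    return (UFIX8)((a + b) - c);
}

/* m3 = m1 * m2. Each element is accumulated in 64 bits and then shifted
 * right by n to return to 32 bits. The right shift of a negative
 * accumulator relies on arithmetic shifting, which common compilers provide. */
static void combine_mat_sketch(SFIX32Mat4x4 m1, SFIX32Mat4x4 m2,
                               SFIX32Mat4x4 m3, UFIX8 n)
{
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            SFIX64 acc = 0;
            for (int k = 0; k < 4; k++)
                acc += (SFIX64)m1[i][k] * m2[k][j];
            m3[i][j] = (SFIX32)(acc >> n);
        }
    }
}
```

For example, combining a matrix with 16 mantissa bits and a matrix with 8 mantissa bits into a result with 12 mantissa bits uses n = shift_for(16, 8, 12), which is 12.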



Non-NHCS vector transform

void TransQuad_… (… c)

Parameters

Input vector in SFIX16Quad with TEXTURE_SFIX16 bits mantissa.

m    Transform matrix in SFIX32Mat4x4 and DEFAULT_SFIX32 format.

c    Output vector after transform, in SFIX16 format with TEXTURE_SFIX16 bits mantissa.

Return value

No return value.

Remarks

♦ Overflow:
Appears when the result goes out of the range of the TEXTURE_SFIX16 mantissa.
♦ Underflow:
Appears when the result goes out of the range of the TEXTURE_SFIX16 mantissa.



void TransNorm_…

Parameters

Input vector in SFIX16Triple with NORMAL_SFIX16 bits mantissa.

Transform matrix in SFIX32Mat4x4 and DEFAULT_SFIX32 format.

Output vector after transform. It is in SFIX16 format with NORMAL_SFIX16 bits mantissa.

Return value

No return value.

Remarks

♦ Matrix is used:
To transform a normal, only the upper 3x3 part of m is used.



NHCS to non-NHCS conversion

Parameters

Input w to be divided from the NHCS vertex, SFIX32. It is the b[3] in TransQuad_SFIX32().

Input shifted bits returned from TransQuad_SFIX32(), for calculating the correct w.

Input vertex after TransQuad_SFIX32(), in NHCS.

cc    Output vertex in non-NHCS SFIX32 format. cc[0]-cc[2] have ONE_SFIX32 bits mantissa, and cc[3] has DEFAULT_SFIX32 bits mantissa.

Return value

No return value.

Remarks

♦ With this function the actual clip space vertex is obtained from the NHCS clip space vertex, for finally converting to a floating-point vertex and output.

void DivW…

Parameters

Input w to be divided from the NHCS vertex, SFIX32. It is the b[3] in TransQuad_SFIX32().

Input shifted bits returned from TransQuad_SFIX32(), for calculating the correct w.

Input vertex after TransQuad_SFIX32(), in NHCS.

Output vertex with DEFAULT_SFIX32 bits mantissa.

Return value

No return value.

Remarks

♦ This function is used in texture coordinate generation from view space position, so the precision and range are different from DivWW_SFIX32 above.

Remarks

♦ NHCS is used to compress the operand from 32-bits to 16-bits since only the sign is needed.

View frustum culling
View frustum culling removes the triangles whose vertices are outside of one view frustum plane. The view frustum involves 6 planes:

♦ Left plane
♦ Right plane
♦ Top plane
♦ Bottom plane
♦ Near plane
♦ Far plane

A UFIX8 is set to hold 6 flags for culling. FIG. 10 illustrates an exemplary implementation of a buffer to hold these flags: a UFIX8 format buffer stores the flags of a vertex in clip space. If it is assumed that b is the NHCS clip space coordinates of the vertex, the algorithm is:

    SFIX32Quad b;   // NHCS clip space coordinates
    UFIX8 f = 0;

    if (b[0] < -b[3])
        f |= 0x01;
    else if (b[0] > b[3])
        f |= 0x02;

    if (b[1] < -b[3])
        f |= 0x04;
    else if (b[1] > b[3])
        f |= 0x08;

    if (b[2] < 0)
        f |= 0x10;
    else if (b[2] > b[3])
        f |= 0x20;



After the flags for each vertex are obtained, an "AND" operation can be used to test whether the vertices are outside of the same plane.

The flag is also useful in the vertex cache, and the 2 unused bits will indicate:

♦ Transformed status (indicates whether a vertex has been transformed).
♦ Lit status (indicates whether a vertex has been lit).
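The flag computation and the "AND" test described above can be sketched as follows. The enum names and the function names cull_flags and triangle_outside are hypothetical labels chosen for this example; the bit values follow the algorithm given in the text.

```c
#include <stdint.h>

typedef int32_t SFIX32;
typedef uint8_t UFIX8;
typedef SFIX32 SFIX32Quad[4];

/* Hypothetical names for the six plane bits used in the text. */
enum {
    CULL_LEFT   = 0x01,
    CULL_RIGHT  = 0x02,
    CULL_BOTTOM = 0x04,
    CULL_TOP    = 0x08,
    CULL_NEAR   = 0x10,
    CULL_FAR    = 0x20
};

/* Compute the culling flags of one vertex from its clip space
 * coordinates b = (x, y, z, w). Because every test compares against w,
 * a common NHCS scale on all four components does not change the result. */
static UFIX8 cull_flags(const SFIX32Quad b)
{
    UFIX8 f = 0;

    if (b[0] < -b[3])     f |= CULL_LEFT;
    else if (b[0] > b[3]) f |= CULL_RIGHT;

    if (b[1] < -b[3])     f |= CULL_BOTTOM;
    else if (b[1] > b[3]) f |= CULL_TOP;

    if (b[2] < 0)         f |= CULL_NEAR;
    else if (b[2] > b[3]) f |= CULL_FAR;

    return f;
}

/* A triangle can be rejected when all three vertices share at least one
 * "outside" bit, that is, when they lie outside the same plane. */
static int triangle_outside(UFIX8 f0, UFIX8 f1, UFIX8 f2)
{
    return (f0 & f1 & f2) != 0;
}
```

Which of the y bits corresponds to the top plane and which to the bottom plane is an assumption here; the text only fixes the bit values.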



Lighting

The lighting model used is the Phong model, with lighting performed in view space. A material should be assigned to the object. The ambient, diffuse, specular and power properties of the material are denoted as M_Ambient, M_Diffuse, M_Specular and M_Power respectively. In D3D, M_Ambient, M_Diffuse and M_Specular are colors, and each component is a float within [0-1].

Each component only need be represented as:

UFIX8    8 bits mantissa

Lighting parameters

The color of lighting is denoted as L_Ambient, L_Diffuse and L_Specular. With normalized vectors N, L and V, which represent the vertex normal, the vertex light direction and the vertex-view direction respectively, the color of a vertex can be calculated from these quantities.

FIGS. 9A and 9B illustrate an exemplary implementation of the vectors in a D3DM Phong Model. As shown in FIG. 9A, L is the vector from vertex to light, and N is the vertex normal. R is the vector which is symmetric to L by N. As shown in FIG. 9B, V is the vector from vertex to view point, and H is the half vector of L+V.

«as Chosen. However, this cho,ce also b„ gs «.bte ^^^^ 
oo„«nsshea.anasc*.Mh<««2«^^ 

rendering system. 




normalize

Parameters

a    Un-normalized input in SFIX16 in NHCS.

Return value

Invert length in SFIX32.

Remarks

♦ It does not matter if the mantissa bits n satisfy 2n > 32, because the calculation does not use n explicitly.
♦ Newton's iteration method is used here for solving the invert square root, using a 256-item lookup table.
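A minimal sketch of Newton's iteration for the inverse square root in fixed point is shown below. It is not the library's routine: the function name inv_sqrt_q16 is hypothetical, the input and output are assumed to carry 16 mantissa bits, and the initial guess is derived from the position of the highest set bit rather than from the lookup table mentioned above.

```c
#include <stdint.h>

/* Returns 1/sqrt(x), where x and the result both carry 16 mantissa bits.
 * x must be non-zero. Newton's iteration for f(y) = 1/y^2 - x is
 * y' = y * (3 - x*y*y) / 2. */
static uint32_t inv_sqrt_q16(uint32_t x)
{
    /* Locate the highest set bit to seed the iteration with a power of
     * two that is within a factor of sqrt(2) of the true result. */
    int p = 31;
    while (((x >> p) & 1u) == 0)
        p--;
    uint64_t y = (uint64_t)1 << (24 - (p + 1) / 2);

    for (int i = 0; i < 6; i++) {
        uint64_t t = ((uint64_t)x * y) >> 16;   /* x*y           */
        t = (t * y) >> 16;                      /* x*y*y, near 1 */
        y = (y * ((3u << 16) - t)) >> 17;       /* y*(3 - t)/2   */
    }
    return (uint32_t)y;
}
```

A table-based seed, as the remarks describe, would presumably reduce the number of iterations needed; the fixed count of six used here is a conservative choice for the coarse power-of-two seed.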




Parameters

a    Un-normalized input in SFIX16 in NHCS.

Normalized output in SFIX16 format with NORMAL_SFIX16 mantissa.

Remarks

♦ SFIX32 is used to hold the intermediate result to prevent overflow and keep precision.



Normalize NHCS Vector

void …

Parameters

Un-normalized input in SFIX16 in NHCS.

Normalized output in SFIX16 format with NORMAL_SFIX16 mantissa.

Return value

No return value.

Remarks

♦ SFIX32 is used to hold the intermediate result to prevent overflow.
♦ It is used in normalization of directional light. It gives a normal L from the vertex to the lighting source.



Subtraction of Two NHCS Vectors

… SFIX16Triple c)

Parameters

a, b    Input vectors in SFIX32 with NHCS format.

Remarks

♦ It is used in obtaining the direction L when using a point light.



This function returns 



Parameters

Normalized input in SFIX16 with DEFAULT_SFIX16 bits mantissa.

Remarks

♦ If the two vectors are normalized, no overflow occurs at all because the result will be within (0-1).
♦ A value that is less than 0 is clamped to 0.

Power

Parameters

Power base with ONE_UFIX16 bits mantissa.

n    Power exponential within 0-127.

Remarks

♦ The efficient digits of n are used to decide how many multiplies are needed.
♦ In the rendering pipeline, n can be fixed. Static variables are used to store n and its efficient digits. If n is the same in consecutive calls, its efficient digits are not calculated again.
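One common way to bound the number of multiplies by the number of significant bits of n is exponentiation by squaring. The sketch below is offered under that assumption; it is not confirmed to be the method used by the library. The name pow_q16 is hypothetical, and the base is assumed to carry 16 mantissa bits with a value in [0, 1].

```c
#include <stdint.h>

#define ONE_Q16 ((uint32_t)1 << 16)   /* assumed encoding of 1.0 */

/* base^n for a base in [0, 1] with 16 mantissa bits and n in 0-127.
 * The loop runs once per significant bit of n, so at most 7 squarings
 * and 7 accumulating multiplies are needed. */
static uint32_t pow_q16(uint32_t base, uint8_t n)
{
    uint64_t result = ONE_Q16;
    uint64_t b = base;

    while (n != 0) {
        if (n & 1u)
            result = (result * b) >> 16;   /* fold in this bit's power  */
        b = (b * b) >> 16;                 /* square for the next bit   */
        n >>= 1;
    }
    return (uint32_t)result;
}
```

Because n is often the same material power for many vertices, caching n together with its bit count, as the remarks describe, avoids recomputing the loop bound on every call.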



Half Vector 



The half vector is used to approximate the actual cos r = (N · H) for calculating the specular component. H can be calculated from the normalized L and V:

H = (L + V) / |L + V|

L and V are represented by SFIX16Triple with NORMAL_SFIX16 bits mantissa. To avoid overflow and keep precision, they are first added together as SFIX32Triple. Next, the half vector H is made in NHCS SFIX16Triple, and H then is normalized.
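The steps above can be sketched as follows. The value 14 for NORMAL_SFIX16, the function names half_vector and isqrt64, and the use of an integer square root for the normalization are assumptions of this example; the intermediate NHCS compression step is omitted and the 32-bit sum is normalized directly.

```c
#include <stdint.h>

typedef int16_t SFIX16;
typedef int32_t SFIX32;
typedef SFIX16 SFIX16Triple[3];

#define NORMAL_SFIX16 14   /* assumed mantissa bits for normals */

/* Integer square root of a 64-bit value (floor of the true root). */
static uint32_t isqrt64(uint64_t v)
{
    uint64_t r = 0, bit = (uint64_t)1 << 62;
    while (bit > v)
        bit >>= 2;
    while (bit != 0) {
        if (v >= r + bit) {
            v -= r + bit;
            r = (r >> 1) + bit;
        } else {
            r >>= 1;
        }
        bit >>= 2;
    }
    return (uint32_t)r;
}

/* h = normalize(l + v). The sum is held in 32 bits so that adding two
 * 16-bit components cannot overflow. */
static void half_vector(const SFIX16Triple l, const SFIX16Triple v,
                        SFIX16Triple h)
{
    SFIX32 s[3];
    uint64_t len2 = 0;

    for (int i = 0; i < 3; i++) {
        s[i] = (SFIX32)l[i] + v[i];
        len2 += (uint64_t)((int64_t)s[i] * s[i]);
    }

    uint32_t len = isqrt64(len2);
    for (int i = 0; i < 3; i++) {
        /* A zero-length sum (l == -v) has no defined direction. */
        h[i] = len ? (SFIX16)(((int64_t)s[i] * (1 << NORMAL_SFIX16)) / (int64_t)len)
                   : 0;
    }
}
```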



Texture coordinate generation uses the view space normal, position or reflection vector to generate the texture coordinates in each vertex. The view space normal and position are available after lighting in view space. However, reflection vectors need to be calculated here.



Reflection Vector from Normal and View

Parameters

norm    Normalized normal in SFIX16, NORMAL_SFIX16.

view    Normalized view direction in SFIX16, NORMAL_SFIX16.

reflect    Normalized output in SFIX16 format with NORMAL_SFIX16 mantissa.

NHCS Clip Space Coordinates Clipping Algorithm

The transforms can be combined into a 4x4 matrix:

$(x, y, z, 1)\,M_{4\times4} = (x_p, y_p, z_p, w_p)$ (1)

The term $x_p / w_p$ is defined, and is similar to y, z. In fact, the term is the normalized screen space coordinates. This assumes the correct $w_p$ is obtained for each vertex. Multiplying (1) by $1/w_p$ yields:

$(x/w_p,\; y/w_p,\; z/w_p,\; 1/w_p)\,M_{4\times4} = (x_p/w_p,\; y_p/w_p,\; z_p/w_p,\; 1)$ (2)

Equation (2) is a linear equation, which indicates that $1/w_p$ can be linearly interpolated. Given three vertices and three texture coordinates, $(x_i, y_i, z_i, 1)$ and $(u_i, v_i, 1)$ $(i = 1, 2, 3)$ for a triangle, there exists an affine transform which maps texture coordinates to object space, if the triangle is not degenerated:

$(u, v, 1)\,A_{3\times4} = (x, y, z, 1)$ (3)

Combining (3) and (1), both sides are divided by $w_p$, and thus:

$(u/w_p,\; v/w_p,\; 1/w_p)\,B = (x_p/w_p,\; y_p/w_p,\; z_p/w_p,\; 1)$ (4)

where

$B = A_{3\times4}\,M_{4\times4}$

Equation (4) indicates that $u/w_p$ and $v/w_p$ can be interpolated linearly. For perspective-correct texture mapping, after linearly interpolating $u/w_p$, $v/w_p$ and $1/w_p$, the correct texture coordinates can be computed.
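The consequence of Equation (4) can be illustrated with a small sketch that interpolates u/w, v/w and 1/w linearly between two vertices and then recovers u and v by division. Floating point is used only for readability, whereas the system described here works in fixed point; the TexVertex type and the function name persp_interp are hypothetical.

```c
/* Hypothetical vertex record: texture coordinates and clip-space w. */
typedef struct {
    double u, v, w;
} TexVertex;

/* Perspective-correct interpolation between a and b at screen-space
 * parameter t in [0, 1]. u/w, v/w and 1/w are interpolated linearly,
 * and the texture coordinates are recovered by dividing by 1/w. */
static void persp_interp(const TexVertex *a, const TexVertex *b, double t,
                         double *u, double *v)
{
    double inv_w    = (1.0 - t) / a->w        + t / b->w;
    double u_over_w = (1.0 - t) * a->u / a->w + t * b->u / b->w;
    double v_over_w = (1.0 - t) * a->v / a->w + t * b->v / b->w;

    *u = u_over_w / inv_w;
    *v = v_over_w / inv_w;
}
```

For two vertices with (u, v, w) = (0, 0, 1) and (1, 1, 3), the screen-space midpoint maps to u = v = 0.25 rather than 0.5, which is the difference between perspective-correct and plain linear interpolation.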
The algorithm for interpolating between two points is:

Input: the two points and the clip plane.

The intersection point will satisfy:

$1/w_p = 1/w_{1p} + (1/w_{2p} - 1/w_{1p})\,t$

Taking this into the clip plane yields:

^ 



20 



Then: 
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10 



15 



1/W =l/W„+(l/W3p-l/>^.p)' 



And, 



After NHCS transform, gives: 
which gives: 

(X V, ,Z,„,W,.) = — ^(^i»P»:>'l"P'^>"P''^»"P^ 

i.x,p,/lp, ip. \pf ^^^^^ 

(X V, ,Z2„.W2p) = — ^ {X^„p,y2np'^2«p>'^2«p) 

w.«n (^w'3'w'2.v'^^^p^ becomes: 
Thus, the final representation of 
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(^\^\np'^2np ^ 2np inp / — — ^ \ 



And the representation of , ..h„ +cz ) 



>'p = 



In case the new intersection point will participate in further clipping, it can be written in NHCS form. Here, C is the shifted bits and w is the weight, and the interpolation parameter is:
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- Winfn \^\np , j... \ 



Miscellaneous Functions

There are NHCS functions, including functions that perform NHCS conversion, that have not been discussed in the previous sections. These functions will now be discussed.



Parameters

Input integer, unsigned 8-bits integer in UFIX8 format.

Return value

Efficient digits of the integer, in UFIX8 format.

Remarks

♦ Using Bisearch algorithm.

Parameters

Input integer, signed 32-bits integer in SFIX32 format.

Return value

Efficient digits of the integer, in UFIX8 format.

Remarks

♦ Using Bisearch algorithm.
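A binary search over bit positions is one way to count efficient digits. The sketch below assumes that "efficient digits" means the number of significant bits (the position of the highest set bit plus one) and uses a 32-bit unsigned input; the function name effi_digit_u32 is hypothetical.

```c
#include <stdint.h>

typedef uint8_t  UFIX8;
typedef uint32_t UFIX32;

/* Number of significant bits in a, found by halving the search range at
 * each step: 16, 8, 4, 2 and 1 bit. Returns 0 for a == 0. */
static UFIX8 effi_digit_u32(UFIX32 a)
{
    UFIX8 digits = 0;

    if (a >> 16) { digits += 16; a >>= 16; }
    if (a >> 8)  { digits += 8;  a >>= 8;  }
    if (a >> 4)  { digits += 4;  a >>= 4;  }
    if (a >> 2)  { digits += 2;  a >>= 2;  }
    if (a >> 1)  { digits += 1;  a >>= 1;  }

    /* After the shifts a is 0 or 1; a remaining 1 is the top bit. */
    return (UFIX8)(digits + (a ? 1 : 0));
}
```

The search takes five comparisons for any 32-bit input, compared with up to 32 for a bit-by-bit scan.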
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ralriilatfi Efficient nigits in SF1X64 
UFIX8 EffiDigit_SFIX64(SFIX64 a)

This function calculates efficient digits in an SFIX64 integer.

Parameters

a    Input integer, signed 64-bits integer in SFIX64 format.




Conversion from SFIX64Quad to SFIX32Quad NHCS

int NHCS_SFIX64Quad(SFIX64Quad …

This function converts from non-NHCS to NHCS.

Parameters

Input integers, signed 64-bits Quad, in SFIX64Quad format.

Output integers, signed 32-bits Quad, in SFIX32Quad, NHCS format.

Return value

An integer that records the shift bits from 64-bit non-NHCS to 32-bit NHCS.

Remarks

♦ NHCS_SFIX64Quad is used in transform. In transform, no shift is needed when the efficient digits of the maximum component are less than the storage bits.
♦ A vertex in clip space is either NHCS or non-NHCS. For recovering the correct w, the shift bits need to be recorded.
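The behavior described in these remarks can be sketched as follows: find the component with the most significant bits, shift all four components by the same amount so that it fits in 32 bits, and return the shift. The function names nhcs_sfix64quad_sketch and effi_digit_s64 are hypothetical, and division is used in place of a right shift so that negative components are handled portably.

```c
#include <stdint.h>

typedef int32_t SFIX32;
typedef int64_t SFIX64;
typedef SFIX64 SFIX64Quad[4];
typedef SFIX32 SFIX32Quad[4];

/* Significant bits in the magnitude of a (assumed meaning of
 * "efficient digits" for a signed value). */
static int effi_digit_s64(SFIX64 a)
{
    uint64_t m = (a < 0) ? (uint64_t)(-(a + 1)) + 1u : (uint64_t)a;
    int d = 0;
    while (m != 0) {
        d++;
        m >>= 1;
    }
    return d;
}

/* Scale all four components by the same power of two so that the largest
 * one fits in a signed 32-bit value, and return the number of bits
 * shifted. No shift is applied when the values already fit. */
static int nhcs_sfix64quad_sketch(const SFIX64Quad in, SFIX32Quad out)
{
    int max_digits = 0;
    for (int i = 0; i < 4; i++) {
        int d = effi_digit_s64(in[i]);
        if (d > max_digits)
            max_digits = d;
    }

    int shift = (max_digits > 31) ? (max_digits - 31) : 0;
    for (int i = 0; i < 4; i++)
        out[i] = (SFIX32)(in[i] / ((SFIX64)1 << shift));

    return shift;
}
```

Because all components share one scale, ratios such as x/w are preserved, and the returned shift is what allows the true w to be recovered later.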





Parameters

Input integers, signed 64-bits Triple, non-NHCS.

Output integers, signed 16-bits Triple, NHCS.

Return value

No return value.

Remarks

♦ NHCS_SFIX64Triple is used in lighting before normalization. Whether or not the efficient digits of the maximum component are less than the storage bits, a shift is needed to preserve precision.




Remarks

♦ NHCS_SFIX32Triple is used in lighting before normalization. Whether or not the efficient digits of the maximum component are less than the storage bits, a shift is needed to preserve precision.

The foregoing description of the invention has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of the invention be limited not by this detailed description of the invention, but rather by the claims appended hereto.
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